SwePub

Result list for search "db:Swepub ;pers:(Jantsch Axel);srt2:(2005-2009);srt2:(2008)"

  • Results 1-10 of 19
1.
  • Al Khatib, Iyad, 1975- (author)
  • Performance Analysis of Application-Specific Multicore Systems on Chip
  • 2008
  • Doctoral thesis (other academic/artistic). Abstract:
    • The last two decades have witnessed the birth of revolutionary technologies in data communications, including wireless technologies, System on Chip (SoC), Multi Processor SoC (MPSoC), Network on Chip (NoC), and more. At the same time, we have witnessed that performance does not always keep pace with expectations in many services, such as multimedia services and biomedical applications. Moreover, the IT market has suffered from some crashes. This prompted us to think of making use of available technologies and developing new ones so that the performance level is suitable for given applications and services. In the medical field, from a statistical viewpoint, the diseases causing the largest number of deaths are heart diseases, namely Cardiovascular Disease (CVD) and Stroke. The application with the largest market for CVD is electrocardiogram (ECG/EKG) analysis. According to the World Health Organization (WHO) report in 2003, 29.2% of global deaths are due to CVD and Stroke, half of which could be prevented if there was proper monitoring. We found in the new advances in microelectronics (NoC, SoC, and MPSoC) a chance of a solution for such a big problem. We look at the communication technologies, wireless networks, and MPSoC and realize that many projects can be founded, and that they may affect people's lives positively, for example by curing people more rapidly and by enabling homecare for such large-scale diseases. These projects have a medical impact as well as economic and social impacts. The intention is to use performance analysis of interconnected microelectronic systems and combine it with MPSoC and NoC technologies in order to evolve to new systems on chip that may make a difference. Technically, we aim at rendering more computations in less time, on a chip with smaller volume, and with less expense. The performance demand and the vision of having a market success, i.e. contributing to lower healthcare costs, pose many challenges on the hardware/software co-design to meet these goals. This calls for the development of new integrated circuits featuring increased energy efficiency while providing higher computation capabilities, i.e. better performance. The biomedical application of ECG analysis is an ideal target for an application-specific SoC implementation. However, new 12-lead ECG analysis algorithms are needed to meet the aforementioned goals. In this thesis, we present two novel algorithms for ECG analysis, namely the Autocorrelation-Function (ACF) based algorithm and the Fast Fourier Transform (FFT) based algorithm. In this respect, we explore the design space by analyzing different hardware and software architectures. As a result, we realize a design with twelve processors that can compute 3.5 million arithmetic computations and respect the hard real-time deadline for our biomedical application (3.5-4 seconds), and that can deploy the ACF-based and FFT-based algorithms. Then, we investigate the configuration space looking for the most effective solution, performance and energy-wise. Consequently, we present three interconnect architectures (Single Bus, Full Crossbar, and Partial Crossbar) and compare them with existing solutions. The sampling frequencies of 2.2 kHz and 4 kHz, with 12 DSPs, are found to be the critical points for our Shared-Bus design and Crossbar architecture, respectively. We also show how our performance analysis methods can be applied to such a field of SoC design and with a specific-purpose application in order to converge to a solution that is acceptable from a performance viewpoint, meets the real-time demands, and can be implemented with the present technologies while at the same time paving the way for easier and faster development. In order to connect our MPSoC solution to communication networks to transmit the medical results to a healthcare center, we come up with new protocols that will allow the integration of multiple networks on chips in a communication network. Finally, we present a methodology for HW/SW codesign for application-specific systems (with focus on biomedical applications) that require a large number of computations, since this will foster the convergence to solutions that are acceptable from a performance point of view.
  •  
2.
  • Grimm, Christoph, et al. (author)
  • C-Based Design of Embedded Systems - Editorial
  • 2008
  • In: EURASIP Journal on Embedded Systems. - : Springer Science and Business Media LLC. - 1687-3955 .- 1687-3963. ; :1, p. 243890-
  • Journal article (other academic/artistic)
  •  
3.
  • Khatib, Iyad Al, et al. (author)
  • A multiprocessor system-on-chip for real-time biomedical monitoring and analysis : ECG prototype architectural design space exploration
  • 2008
  • In: ACM Transactions on Design Automation of Electronic Systems. - : Association for Computing Machinery (ACM). - 1084-4309 .- 1557-7309. ; 13:2, p. 31-
  • Journal article (peer-reviewed). Abstract:
    • In this article we focus on multiprocessor system-on-chip (MPSoC) architectures for human heart electrocardiogram (ECG) real time analysis as a hardware/software (HW/SW) platform offering an advance relative to state-of-the-art solutions. This is a relevant biomedical application with good potential market, since heart diseases are responsible for the largest number of yearly deaths. Hence, it is a good target for an application-specific system-on-chip (SoC) and HW/SW codesign. We investigate a symmetric multiprocessor architecture based on STMicroelectronics VLIW DSPs that process in real time 12-lead ECG signals. This architecture improves upon state-of-the-art SoC designs for ECG analysis in its ability to analyze the full 12 leads in real time, even with high sampling frequencies, and its ability to detect heart malfunction for the whole ECG signal interval. We explore the design space by considering a number of hardware and software architectural options. Comparing our design with present-day solutions from an SoC and application point-of-view shows that our platform can be used in real time and without failures.
  •  
4.
  • Liu, Ming, 1982- (author)
  • A High-end Reconfigurable Computation Platform for Particle Physics Experiments
  • 2008
  • Licentiate thesis (other academic/artistic). Abstract:
    • Modern nuclear and particle physics experiments run at a very high reaction rate and are able to deliver a data rate of up to a hundred GBytes/s. This data rate is far beyond the storage and on-line analysis capability. Fortunately, physicists are interested in only a very small proportion of this huge amount of data. Therefore, in order to select the interesting data and reject the background by sophisticated pattern recognition processing, it is essential to realize an efficient data acquisition and trigger system that reduces the data rate by several orders of magnitude. Motivated by the requirements from multiple experiment applications, we are developing a high-end reconfigurable computation platform for data acquisition and triggering. The system consists of a scalable number of compute nodes, which are fully interconnected by high-speed communication channels. Each compute node features 5 Xilinx Virtex-4 FX60 FPGAs and up to 10 GBytes DDR2 memory. A hardware/software co-design approach is proposed to develop custom applications on the platform, partitioning performance-critical calculation to the FPGA hardware fabric while leaving flexible and slow controls to the embedded CPU plus the operating system. The system is expected to be high-performance and general-purpose for various applications, especially in the physics experiment domain. As a case study, the particle track reconstruction algorithm for HADES has been developed and implemented on the computation platform in the format of processing engines. The Tracking Processing Unit (TPU) recognizes peak bins on the projection plane and reconstructs particle tracks in real time. Implementation results demonstrate its acceptable resource utilization and the feasibility of implementing the module together with the system design on the FPGA. Experimental results show that the online track reconstruction computation achieves 10.8 - 24.3 times performance acceleration per TPU module when compared to the software solution on a Xeon 2.4 GHz commodity server.
  •  
5.
  • Liu, Ming, et al. (author)
  • ATCA-based Computation Platform for Data Acquisition and Triggering in Particle Physics Experiments
  • 2008
  • In: 2008 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE AND LOGIC APPLICATIONS, VOLS 1 AND 2. ; , pp. 287-292
  • Conference paper (peer-reviewed). Abstract:
    • An ATCA-based computation platform for data acquisition and trigger applications in nuclear and particle physics experiments has been developed. Each Compute Node (CN), which appears as a Field Replaceable Unit (FRU) in an ATCA shelf, features 5 Xilinx Virtex-4 FX60 FPGAs and up to 10 GBytes DDR2 memory. Connectivity is provided by 8 optical links and 5 Gigabit Ethernet ports, which are mounted on each board to receive data from detectors and forward results to outer shelves or PC farms with attached mass storage. Fast point-to-point on-board interconnections between FPGAs as well as the full-mesh shelf backplane provide flexibility and high bandwidth to partition algorithms and correlate results among them. The system represents a highly reconfigurable and scalable solution for multiple applications.
  •  
6.
  • Liu, Ming, et al. (author)
  • System-on-an-FPGA Design for Real-time Particle Track Recognition and Reconstruction in Physics Experiments
  • 2008
  • In: 11TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN - ARCHITECTURES, METHODS AND TOOLS. - LOS ALAMITOS : IEEE COMPUTER SOC. ; , pp. 599-605
  • Conference paper (peer-reviewed). Abstract:
    • In particle physics experiments, the momenta of charged particles are studied by observing their deflection in a magnetic field. Dedicated detectors measure the particle tracks, and complex algorithms are required for track recognition and reconstruction. This CPU-intensive task is usually implemented as off-line software running on PC clusters. In this paper we present a system-on-chip design for track recognition and reconstruction based on modern FPGA technologies. The basic principle of the algorithm is ported from software into the FPGA fabric. The fundamental architecture of the tracking processor is described in detail. Working as processing engines in compute nodes, the tracking processors help recognize potential track candidates in real time and improve the selection efficiency of the data acquisition and trigger system. Our design study shows that the tracking module can be integrated in a single Xilinx Virtex-4 FX60 FPGA. The processing capability of the design is about 16.7K sub-events per second per module with our experimental setup, which is a 20-fold speedup compared to the software implementation.
  •  
7.
  • Lu, Zhonghai, et al. (author)
  • Cluster-based simulated annealing for mapping cores onto 2D mesh networks on chip
  • 2008
  • In: 2008 IEEE Workshop On Design And Diagnostics Of Electronic Circuits And Systems, Proceedings. - 9781424422760 ; , pp. 92-97
  • Conference paper (peer-reviewed). Abstract:
    • In Network-on-Chip (NoC) application design, core-to-node mapping is an important but intractable optimization problem. In the paper, we use simulated annealing to tackle the mapping problem in 2D mesh NoCs. In particular, we combine a clustering technique with the simulated annealing to speed up the convergence to near-optimal solutions. The clustering exploits the connectivity and distance relation in the network architecture as well as the locality and bandwidth requirements in the core communication graph. The annealing is cluster-aware and may be dynamically constrained within clusters. Our experiments suggest that simulated annealing can be effectively used to solve the mapping problem with a scalable size, and the combined strategy improves over the simulated annealing in execution time by up to 30% without compromising the quality of solutions.
  •  
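As an illustration of the mapping problem described in entry 7, the following is a minimal sketch of plain simulated annealing for placing cores on a 2D mesh. It is not the cluster-based method of the paper: the clustering step is omitted, and the cost function (bandwidth times hop distance), the cooling schedule, and all names (`anneal`, `cost`, `TRAFFIC`) are assumptions chosen for the example.

```python
import math
import random


def manhattan(a, b, width):
    """Hop distance between mesh nodes a and b on a mesh `width` columns wide."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)


def cost(mapping, traffic, width):
    """Sum of bandwidth * hop distance over all communicating core pairs."""
    return sum(bw * manhattan(mapping[src], mapping[dst], width)
               for (src, dst), bw in traffic.items())


def anneal(traffic, n_cores, width, steps=5000, t0=10.0, alpha=0.999, seed=0):
    """Plain simulated annealing over core-to-node permutations (assumed setup)."""
    rng = random.Random(seed)
    mapping = list(range(n_cores))          # core i is placed on node mapping[i]
    current = cost(mapping, traffic, width)
    best, best_cost = mapping[:], current
    temperature = t0
    for _ in range(steps):
        i, j = rng.sample(range(n_cores), 2)
        mapping[i], mapping[j] = mapping[j], mapping[i]   # propose a swap
        new = cost(mapping, traffic, width)
        delta = new - current
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = new
            if current < best_cost:
                best, best_cost = mapping[:], current
        else:
            mapping[i], mapping[j] = mapping[j], mapping[i]   # undo the swap
        temperature *= alpha                 # geometric cooling
    return best, best_cost


# Hypothetical core communication graph: (source core, destination core) -> bandwidth.
TRAFFIC = {(0, 8): 100, (1, 7): 80, (2, 6): 60, (3, 5): 40, (0, 4): 20}

if __name__ == "__main__":
    initial_cost = cost(list(range(9)), TRAFFIC, 3)
    best_mapping, best_cost = anneal(TRAFFIC, n_cores=9, width=3)
    print("initial cost:", initial_cost, "best cost found:", best_cost)
```

The sketch keeps the best mapping seen so far, so the returned cost can never exceed that of the initial identity placement. A cluster-aware variant, as described in the abstract, would additionally restrict or bias the swap proposals.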
8.
  •  
9.
  • Lu, Zhonghai, et al. (author)
  • Network-on Chip Micro-Benchmarks
  • 2008
  • In: Embedded Systems Design. ; :September
  • Journal article (peer-reviewed). Abstract:
    • The rapid development of Network-on-Chip (NoC) calls for a systematic approach to evaluate and fairly compare various NoC architectures. In this specification, we define a generic NoC architecture, a comprehensive set of synthetic workloads as micro-benchmarks, workload scenarios and evaluation criteria. These micro-benchmarks enable measuring particular properties of NoC architectures, complementing application benchmarks.
  •  
10.
  • Lu, Zhonghai, et al. (author)
  • TDM virtual-circuit configuration for network-on-chip
  • 2008
  • In: IEEE Transactions on Very Large Scale Integration (vlsi) Systems. - 1063-8210 .- 1557-9999. ; 16:8, pp. 1021-1034
  • Journal article (peer-reviewed). Abstract:
    • In network-on-chip (NoC), time-division-multiplexing (TDM) virtual circuits (VCs) have been proposed to satisfy the quality-of-service requirements of applications. TDM VC is a connection-oriented communication service by which two or more connections take turns to share buffers and link bandwidth using dedicated time slots. In the paper, we first give a formulation of the multinode VC configuration problem for arbitrary NoC topologies. A multinode VC allows multiple source and destination nodes on it. Then we address the two problems of path selection and slot allocation for TDM VC configuration. For the path selection, we use a backtracking algorithm to explore the path diversity, constructively searching the solution space. In the slot allocation phase, overlapped VCs must be configured such that no conflict occurs and their bandwidth requirements are satisfied. We define the concept of a logical network (LN) as an infinite set of associated (time slot, buffer) pairs with respect to a buffer on a given VC. Based on this concept, we develop and prove theorems that constitute sufficient and necessary conditions to establish conflict-free VCs. They are applicable for networks where all nodes operate with the same clock frequency but allowing different phases. Using these theorems, slot allocation for VCs is a procedure of assigning VCs to different LNs. TDM VC configuration can thus be predictable and correct-by-construction. Our experiments on synthetic and real applications validate the effectiveness and efficiency of our approach.
  •  
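The slot allocation problem in entry 10 can be illustrated with a simplified slot-table model. The sketch below is not the logical-network formulation of the paper. It assumes that every link has a slot table of `window` slots, that a flit admitted at slot s on the first link of a path uses the k-th link at slot (s + k) mod window, and that VCs are served greedily in the given order. The function name `allocate_slots` and the example VCs are hypothetical.

```python
def allocate_slots(vcs, window):
    """Greedy conflict-free TDM slot allocation (simplified, assumed model).

    vcs:    dict mapping a VC name to (path, demand), where path is a list of
            link identifiers and demand is the number of slots needed per window.
    window: number of time slots in the repeating slot table of every link.
    Returns a dict mapping each VC name to its list of admission slots.
    """
    busy = set()      # (link, slot) pairs that are already reserved
    plan = {}
    for name, (path, demand) in vcs.items():
        starts = []
        for start in range(window):
            if len(starts) == demand:
                break
            # A flit admitted at `start` occupies hop k of the path at slot start + k.
            needed = [(link, (start + hop) % window) for hop, link in enumerate(path)]
            if all(reservation not in busy for reservation in needed):
                busy.update(needed)
                starts.append(start)
        if len(starts) < demand:
            raise ValueError(f"cannot satisfy bandwidth demand of VC {name!r}")
        plan[name] = starts
    return plan


# Hypothetical example: two VCs that overlap on link "L2".
VCS = {
    "A": (["L1", "L2"], 2),
    "B": (["L2", "L3"], 1),
}

if __name__ == "__main__":
    print(allocate_slots(VCS, window=4))
```

Because each reservation is checked against every (link, slot) pair already taken, two VCs never use the same link in the same slot. Being greedy, the sketch may reject a set of VCs that a different ordering or a backtracking search could accommodate.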